Joint 3D Proposal Generation and Object Detection from View Aggregation

Authors

  • Jason Ku
  • Melissa Mozifian
  • Jungwook Lee
  • Ali Harakeh
  • Steven Lake Waslander
Abstract

We present AVOD, an Aggregate View Object Detection network for autonomous driving scenarios. The proposed neural network architecture uses LIDAR point clouds and RGB images to generate features that are shared by two subnetworks: a region proposal network (RPN) and a second-stage detector network. The proposed RPN uses a novel architecture capable of performing multimodal feature fusion to generate reliable 3D object proposals for multiple object classes in road scenes. Using these proposals, the second-stage detection network performs accurate oriented 3D bounding box regression and category classification to predict the extents, orientation, and classification of objects in 3D space. Our proposed architecture is shown to produce state-of-the-art results on the KITTI 3D object detection benchmark [10] while running in real time with a low memory footprint, making it a suitable candidate for deployment on autonomous vehicles.
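
To make the idea of an oriented 3D bounding box (extents plus orientation) concrete, the following is a minimal sketch that converts one generic box parametrization into its eight corner points. The function name `box_corners`, the axis convention (z vertical, yaw about z), and the parametrization itself are assumptions chosen for illustration; the abstract does not state which box encoding the paper uses.

```python
import math


def box_corners(cx, cy, cz, length, width, height, yaw):
    """Return the 8 corners of an oriented 3D box as (x, y, z) tuples.

    Assumed convention (hypothetical, for illustration only): z is the
    vertical axis, yaw is a rotation about z in radians, and
    (cx, cy, cz) is the box centre.
    """
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)
    # Footprint offsets of the four corners before rotation.
    footprint = (
        (length / 2, width / 2),
        (length / 2, -width / 2),
        (-length / 2, -width / 2),
        (-length / 2, width / 2),
    )
    corners = []
    for dz in (-height / 2, height / 2):  # bottom face, then top face
        for dx, dy in footprint:
            # Rotate the footprint offset by yaw, then translate to the centre.
            x = cx + dx * cos_y - dy * sin_y
            y = cy + dx * sin_y + dy * cos_y
            corners.append((x, y, cz + dz))
    return corners
```

Calling `box_corners(0.0, 0.0, 0.0, 4.0, 2.0, 1.5, math.pi / 2)` would describe a car-sized box rotated a quarter turn about the vertical axis; a detector that regresses centre, extents, and yaw could be decoded to corners in this way.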

Similar resources

ObjectNet3D: A Large Scale Database for 3D Object Recognition

We contribute a large scale database for 3D object recognition, named ObjectNet3D, that consists of 100 categories, 90,127 images, 201,888 objects in these images and 44,147 3D shapes. Objects in the 2D images in our database are aligned with the 3D shapes, and the alignment provides both accurate 3D pose annotation and the closest 3D shape annotation for each 2D object. Consequently, our datab...

Dataset Curation through Renders and Ontology Matching

Research Interests: I am interested in Computer Vision and Machine Learning; specifically, the problems I find interesting include deep learning, fine-grained classification, object detection, and viewpoint estimation. My research experience includes: deep learning, fine-grained visual classification of businesses in street view imagery, Computer Graphics based data generation for Computer Vi...

3D Scene and Object Classification Based on Information Complexity of Depth Data

In this paper the problem of 3D scene and object classification from depth data is addressed. In contrast to high-dimensional feature-based representation, the depth data is described in a low dimensional space. In order to remedy the curse of dimensionality problem, the depth data is described by a sparse model over a learned dictionary. Exploiting the algorithmic information theory, a new def...

A New 3D Object Pose Detection Method Using LIDAR Shape Set

In object detection systems for autonomous driving, LIDAR sensors provide very useful information. However, problems occur because the object representation is greatly distorted by changes in distance. To solve this problem, we propose a LIDAR shape set that reconstructs the shape surrounding the object more clearly by using the LIDAR point information projected on the object. The LIDAR shape s...

Joint Object Class Sequencing and Trajectory Triangulation (JOST)

We introduce the problem of joint object class sequencing and trajectory triangulation (JOST), which is defined as the reconstruction of the motion path of a class of dynamic objects through a scene from an unordered set of images. We leverage standard object detection techniques to identify object instances within a set of registered images. Each of these object detections defines a single 2D ...

Journal title:
  • CoRR

Volume abs/1712.02294  Issue

Pages -

Publication date 2017